Exploration server for computer science research in Lorraine

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Apprentissage et optimisation de politiques pour un bras articulé actionné par des muscles

Internal identifier: 001693 (Main/Exploration); previous: 001692; next: 001694

Authors: Didier Marin [France]; Lionel Rigoux [France]; Olivier Sigaud [France]

Source: Revue d'intelligence artificielle (ISSN 0992-499X), 2013

RBID : Pascal:13-0216766

Abstract

Many research works combine learning from demonstration and policy improvement methods to learn the controller of a robot along a specific trajectory. However, these works lack the capability to learn over the whole reachable space of the robot. In this paper, we propose a method that learns a reactive near-optimal feedback controller in two steps. First, an efficient parametric feedback controller is obtained by learning from demonstration, based on trajectories computed by a costly near-optimal controller. Second, the feedback controller is optimized further with direct policy search methods. As a result, we obtain a controller that executes 20,000 times faster than the original controller with similar performance. Our work is evaluated in simulation.
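
The two steps described in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation: the one-dimensional system, the quadratic cost, the linear policy `u = -theta * x`, the `expert` gain, and the random-perturbation search are all assumptions chosen to keep the example short. Step 1 fits the policy to demonstrations by least squares; step 2 improves it by a simple form of direct policy search that keeps a perturbed parameter only when it lowers the rollout cost.

```python
import random

DT, HORIZON = 0.1, 50
STARTS = [-1.0, -0.5, 0.5, 1.0]  # start states spread over the toy state space


def expert(x):
    """Stand-in for the costly near-optimal controller (hypothetical gain)."""
    return -0.5 * x


def rollout_cost(theta):
    """Total quadratic cost of the feedback policy u = -theta * x over all starts."""
    total = 0.0
    for x in STARTS:
        for _ in range(HORIZON):
            u = -theta * x
            total += x * x + 0.1 * u * u
            x += DT * u  # toy dynamics: x' = x + dt * u
    return total


def fit_from_demonstrations():
    """Step 1: least-squares fit of theta to (state, action) pairs from the expert."""
    sxx = sxu = 0.0
    for x in STARTS:
        for _ in range(HORIZON):
            u = expert(x)
            sxx += x * x
            sxu += x * u
            x += DT * u
    return -sxu / sxx


def policy_search(theta, iterations=200, sigma=0.2, seed=0):
    """Step 2: perturb theta at random and keep only the improvements."""
    rng = random.Random(seed)
    best = rollout_cost(theta)
    for _ in range(iterations):
        candidate = theta + rng.gauss(0.0, sigma)
        cost = rollout_cost(candidate)
        if cost < best:
            theta, best = candidate, cost
    return theta


theta_demo = fit_from_demonstrations()
theta_opt = policy_search(theta_demo)
print("cost after demonstration:", rollout_cost(theta_demo))
print("cost after policy search:", rollout_cost(theta_opt))
```

Because the search only accepts improvements, the second step can never make the demonstrated policy worse on the evaluated start states. The speed-up reported in the abstract comes from replacing the costly controller with a cheap parametric one, which this toy does not attempt to measure.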


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="fr" level="a">Apprentissage et optimisation de politiques pour un bras articulé actionné par des muscles</title>
<author>
<name sortKey="Marin, Didier" sort="Marin, Didier" uniqKey="Marin D" first="Didier" last="Marin">Didier Marin</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>Institut des Systèmes Intelligents et de Robotique UPMC-Paris 6, CNRS UMR 7222 4 place Jussieu</s1>
<s2>75252 Paris</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Rigoux, Lionel" sort="Rigoux, Lionel" uniqKey="Rigoux L" first="Lionel" last="Rigoux">Lionel Rigoux</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>Institut des Systèmes Intelligents et de Robotique UPMC-Paris 6, CNRS UMR 7222 4 place Jussieu</s1>
<s2>75252 Paris</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Sigaud, Olivier" sort="Sigaud, Olivier" uniqKey="Sigaud O" first="Olivier" last="Sigaud">Olivier Sigaud</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>Institut des Systèmes Intelligents et de Robotique UPMC-Paris 6, CNRS UMR 7222 4 place Jussieu</s1>
<s2>75252 Paris</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">13-0216766</idno>
<date when="2013">2013</date>
<idno type="stanalyst">PASCAL 13-0216766 INIST</idno>
<idno type="RBID">Pascal:13-0216766</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000062</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000945</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000038</idno>
<idno type="wicri:explorRef" wicri:stream="PascalFrancis" wicri:step="Checkpoint">000038</idno>
<idno type="wicri:doubleKey">0992-499X:2013:Marin D:apprentissage:et:optimisation</idno>
<idno type="wicri:Area/Main/Merge">001709</idno>
<idno type="wicri:Area/Main/Curation">001693</idno>
<idno type="wicri:Area/Main/Exploration">001693</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="fr" level="a">Apprentissage et optimisation de politiques pour un bras articulé actionné par des muscles</title>
<author>
<name sortKey="Marin, Didier" sort="Marin, Didier" uniqKey="Marin D" first="Didier" last="Marin">Didier Marin</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>Institut des Systèmes Intelligents et de Robotique UPMC-Paris 6, CNRS UMR 7222 4 place Jussieu</s1>
<s2>75252 Paris</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Rigoux, Lionel" sort="Rigoux, Lionel" uniqKey="Rigoux L" first="Lionel" last="Rigoux">Lionel Rigoux</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>Institut des Systèmes Intelligents et de Robotique UPMC-Paris 6, CNRS UMR 7222 4 place Jussieu</s1>
<s2>75252 Paris</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Sigaud, Olivier" sort="Sigaud, Olivier" uniqKey="Sigaud O" first="Olivier" last="Sigaud">Olivier Sigaud</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>Institut des Systèmes Intelligents et de Robotique UPMC-Paris 6, CNRS UMR 7222 4 place Jussieu</s1>
<s2>75252 Paris</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Revue d'intelligence artificielle</title>
<title level="j" type="abbreviated">Rev. intell. artif.</title>
<idno type="ISSN">0992-499X</idno>
<imprint>
<date when="2013">2013</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Revue d'intelligence artificielle</title>
<title level="j" type="abbreviated">Rev. intell. artif.</title>
<idno type="ISSN">0992-499X</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Arm</term>
<term>Capability index</term>
<term>Direct method</term>
<term>Entropy</term>
<term>Feedback regulation</term>
<term>Muscle</term>
<term>Optimal control</term>
<term>Optimal control (mathematics)</term>
<term>Optimization</term>
<term>Policy</term>
<term>Robotics</term>
<term>Search algorithm</term>
<term>Space application</term>
<term>Stochastic control</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Robotique</term>
<term>Rétroaction</term>
<term>Politique</term>
<term>Bras</term>
<term>Muscle</term>
<term>Indice aptitude</term>
<term>Commande optimale</term>
<term>Optimisation</term>
<term>Application spatiale</term>
<term>Méthode directe</term>
<term>Algorithme recherche</term>
<term>Commande stochastique</term>
<term>Entropie</term>
<term>Contrôle optimal</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Robotique</term>
<term>Politique</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Many research works combine learning from demonstration and policy improvement methods to learn the controller of a robot along a specific trajectory. However, these works lack the capability to learn over the whole reachable space of the robot. In this paper, we propose a method that learns a reactive near-optimal feedback controller in two steps. First, an efficient parametric feedback controller is obtained by learning from demonstration, based on trajectories computed by a costly near-optimal controller. Second, the feedback controller is optimized further with direct policy search methods. As a result, we obtain a controller that executes 20,000 times faster than the original controller with similar performance. Our work is evaluated in simulation.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
</country>
<region>
<li>Île-de-France</li>
</region>
<settlement>
<li>Paris</li>
</settlement>
</list>
<tree>
<country name="France">
<region name="Île-de-France">
<name sortKey="Marin, Didier" sort="Marin, Didier" uniqKey="Marin D" first="Didier" last="Marin">Didier Marin</name>
</region>
<name sortKey="Rigoux, Lionel" sort="Rigoux, Lionel" uniqKey="Rigoux L" first="Lionel" last="Rigoux">Lionel Rigoux</name>
<name sortKey="Sigaud, Olivier" sort="Sigaud, Olivier" uniqKey="Sigaud O" first="Olivier" last="Sigaud">Olivier Sigaud</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001693 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001693 | SxmlIndent | more
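
Once a record has been extracted with one of the commands above, it could also be read from Python. The sketch below is only an illustration and is not part of Dilib. It assumes the record has the same layout as the XML shown on this page. The record uses the `wicri:` and `inist:` prefixes without declaring them, which a strict XML parser rejects, so the sketch injects placeholder namespace URIs (an assumption) before parsing. The function names and the shortened `SAMPLE` record are hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions): the record never declares the
# wicri: and inist: prefixes, so they are bound here only to allow parsing.
PLACEHOLDER_NS = 'xmlns:wicri="urn:example:wicri" xmlns:inist="urn:example:inist"'


def parse_record(xml_text):
    """Parse a <record>, binding its undeclared prefixes on the root element."""
    patched = xml_text.replace("<record>", "<record " + PLACEHOLDER_NS + ">", 1)
    return ET.fromstring(patched)


def author_names(record):
    """Return the author names listed in the titleStmt, in document order."""
    path = "./TEI/teiHeader/fileDesc/titleStmt/author/name"
    return [name.text for name in record.findall(path)]


def rbid(record):
    """Return the RBID identifier of the record, or None if it is absent."""
    return record.findtext(".//idno[@type='RBID']")


# Shortened, hypothetical excerpt with the same nesting as the record above.
SAMPLE = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<author>
<name sortKey="Marin, Didier">Didier Marin</name>
<affiliation wicri:level="3">
<country>France</country>
</affiliation>
</author>
<author>
<name sortKey="Sigaud, Olivier">Olivier Sigaud</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="RBID">Pascal:13-0216766</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

record = parse_record(SAMPLE)
print(author_names(record))
print(rbid(record))
```

Binding the prefixes on the root element leaves the record content untouched, so unprefixed elements such as `author` and `idno` can be queried with plain paths.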

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:13-0216766
   |texte=   Apprentissage et optimisation de politiques pour un bras articulé actionné par des muscles
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022